SECOND MOMENT METHOD FOR A FAMILY OF BOOLEAN CSP 

YACINE BOUFKHAD AND OLIVIER DUBOIS 

Abstract. The estimation of phase transitions in random boolean Constraint Satisfaction Problems (CSP) is based on two fundamental tools: the first and second moment methods. While the first moment method on the number of solutions yields upper bounds for any boolean CSP, the second moment method used for computing lower bounds proves to be more tricky and in most cases gives only the trivial lower bound 0. In this paper, we define a subclass of boolean CSP covering the monotone versions of many known NP-complete boolean CSPs. We give a method for computing non-trivial lower bounds for any member of this subclass. This is achieved thanks to an application of the second moment method to some selected solutions, called characteristic solutions, that depend on the boolean CSP considered. We apply this method with a finer analysis to establish that the threshold $r_k$ (ratio: #constraints/#variables) of monotone 1-in-k-SAT satisfies $\log k/k \le r_k \le \log^2 k/k$.
Introduction

Empirical evidence has shown that random instances of boolean Constraint Satisfaction Problems (CSP) exhibit a phase transition, i.e. a sudden change from SAT to UNSAT when the number of constraints increases: there exists a critical value $r^*$ of the ratio $r$ of the number of constraints to the number of variables such that random instances are w.h.p. satisfiable if $r < r^*$ and w.h.p. unsatisfiable if $r > r^*$. The value $r^*$ is called the threshold of the transition. The sharpness of the threshold has been addressed in a series of works [12], [11], [5], [6].

Computing the threshold associated with a CSP is at present out of reach, apart from some exceptions [3], [13], [7], [11] for polynomial subclasses. Since the exact value cannot be computed, upper and lower bounds on $r^*$ are computed instead. These bounds are almost all obtained by different applications of the tools of the probabilistic method, namely the first and second moment methods, and for some of them through the analysis of algorithms [10], [9], [8], [16], [2] for 3-SAT.

While the first moment method on the number of solutions gives an upper bound on the location of the threshold for any boolean CSP, the second moment method on the number of solutions fails at any ratio to estimate the probability of satisfiability. In [1], an original method is presented to overcome this problem in the case of k-SAT, for which the direct calculation also fails.

We define a subclass of CSP and a method that allows us to bound the phase transition from both sides for this subclass. The subclass is characterized by constraints that are closed under permutations. It includes the monotone versions of many well-known problems such as 1-in-k-SAT and NAE-k-SAT, and therefore many NP-complete boolean CSPs.

Roughly speaking, we show how the second moment method can be made "to work" for every problem in this class. More precisely, we parameterize the valuations by their number of variables having the value 1, and we show that there exist precise values of this parameter, depending on the problem, for which the second moment method gives a non-trivial lower bound; the corresponding solutions are called characteristic solutions. We prove that the bound given by this method is at least some well-defined value for any problem in the subclass. However, the generality of this value is obtained at the cost of some weakness. Better bounds can be computed using the same scheme through a finer analysis on a case-by-case basis. To illustrate this, we do the full analysis for positive 1-in-k-SAT to derive the asymptotically optimal lower bound with respect to our method, that is $\log k/k$. To show that this lower bound is tight, we establish using the first moment method an upper bound of $\log^2 k/k$ for $k \ge 7$.

1. Basic definitions and main results 

Given a set $X$ of $n$ boolean variables, a valuation $\sigma$ is a mapping $X \to \{0,1\}$ that assigns to any variable $x \in X$ the value 0 or 1. For an integer $k$, a relation $R$ of arity $k$ is a subset of $\{0,1\}^k$. A relation is said to be trivial if $(0,\ldots,0)$ or $(1,\ldots,1)$ is an element of $R$. We consider throughout the paper only non-trivial relations. A constraint defined from a relation $R$ of arity $k$ is a tuple of $k$ boolean variables, denoted $R(x_1,\ldots,x_k)$, which is said to be satisfied under some valuation $\sigma$ iff $(\sigma(x_1),\ldots,\sigma(x_k)) \in R$; otherwise it is said to be unsatisfied. Given a set of relations $S$, an instance of boolean CSP with respect to $S$, denoted by CSP($S$), is a conjunction of constraints $R(x_1,\ldots,x_k)$ where $R \in S$. An instance CSP($S$) is satisfied iff every constraint is satisfied.

In this paper, we consider a subclass of CSP($S$) defined as follows. A relation $R$ is said to be invariant by permutation iff any permutation of the coordinates of a tuple $t \in R$ is also in $R$. Such a relation is denoted by $R_{inv}$. The invariance property implies that for every tuple $t \in R_{inv}$, all tuples having the same number of coordinates equal to 1 (or 0) as $t$ must also be in $R_{inv}$. This defines an equivalence relation: two elements $t$ and $t'$ of $R_{inv}$ belong to the same equivalence class iff they have the same number of coordinates equal to 1 (or 0). Thus the equivalence classes partition $R_{inv}$ into subsets, each associated to an integer $i$ equal to the number of 1s of the elements of the class.

In this paper, we will only consider boolean CSPs with respect to a single non-trivial relation invariant under permutation, denoted CSP($\{R_{inv}\}$). In order to designate a CSP($\{R_{inv}\}$) more explicitly, we will denote it in an equivalent manner by CSP($I_k$), where $I_k$ is the subset of integers in $\{1,\ldots,k-1\}$ associated to all equivalence classes. Thus an instance of CSP($I_k$) is satisfiable iff there exists a valuation $\sigma$ such that the number of 1s in every constraint of the instance is an element of $I_k$.

Example 1. $k = 4$ and $R = \{1000, 0100, 0010, 0001, 0111, 1011, 1101, 1110\}$. $R$ is invariant by permutation. The set of integers associated to $R$ is $I_4 = \{1, 3\}$. A constraint $(x_{i_1}, x_{i_2}, x_{i_3}, x_{i_4})$ of an instance CSP($R$) is satisfied iff exactly one or three of the four variables of the constraint have the value 1.
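The correspondence between a permutation-invariant relation and its set of integers can be checked mechanically. The following is a minimal sketch on the relation of Example 1; the function names are illustrative and not part of the paper.

```python
from itertools import permutations, product

def is_permutation_invariant(relation):
    """True if every coordinate permutation of every tuple stays in the relation."""
    return all(perm in relation for t in relation for perm in permutations(t))

def ones_counts(relation):
    """Set of integers i such that some tuple of the relation has exactly i ones."""
    return {sum(t) for t in relation}

def relation_from_counts(k, counts):
    """All 0/1 tuples of length k whose number of ones belongs to counts."""
    return {t for t in product((0, 1), repeat=k) if sum(t) in counts}

# Relation of Example 1, written as 0/1 tuples.
R = {tuple(int(c) for c in word)
     for word in ("1000", "0100", "0010", "0001", "0111", "1011", "1101", "1110")}

I4 = ones_counts(R)
```

Rebuilding the relation from `I4` with `relation_from_counts(4, I4)` gives back `R`, which is the sense in which the set of integers fully describes an invariant relation.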

The CSPs of the class defined above are NP-complete for any relation of arity greater than or equal to 3, according to the Schaefer classification. They include two well-known problems of this classification: positive 1-in-k-SAT ($I_k = \{1\}$ according to the above definition) and positive not-all-equal-k-SAT ($I_k = \{1,\ldots,k-1\}$).

The random version of a CSP($I_k$) is as follows. Given a relation $R_{inv}$ and the set of integers $I_k$ associated with $R_{inv}$, a random CSP($I_k$) instance with $m$ constraints over $n$ boolean variables is formed by drawing uniformly, independently and with replacement $m$ tuples of $k$ variables over the set of $n$ variables. Such a random CSP($I_k$) instance is denoted by $I_k(m,n)$. This defines a probability space denoted by $\Omega(I_k,m,n)$ in which instances $I_k(m,n)$ are equiprobable.
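As an illustration of this random model, here is a minimal sketch that draws an instance and measures the fraction of constraints satisfied by a fixed valuation. It assumes that the $k$ variables of a tuple are drawn independently, so that a variable may repeat inside a tuple; this is an assumption made for simplicity, and all names are illustrative.

```python
import random

def random_instance(m, n, k, rng):
    """m constraints, each a k-tuple of variable indices drawn uniformly from range(n).
    Assumption: indices inside a tuple are drawn independently (repeats allowed)."""
    return [tuple(rng.randrange(n) for _ in range(k)) for _ in range(m)]

def satisfies(valuation, constraint, allowed):
    """A constraint is satisfied iff its number of ones belongs to the set I_k."""
    return sum(valuation[x] for x in constraint) in allowed

def is_solution(valuation, instance, allowed):
    return all(satisfies(valuation, c, allowed) for c in instance)

rng = random.Random(0)
n, k, allowed = 1000, 4, {1, 3}
instance = random_instance(20000, n, k, rng)

# A valuation with exactly n/2 variables set to 1.
valuation = [1] * (n // 2) + [0] * (n - n // 2)
fraction = sum(satisfies(valuation, c, allowed) for c in instance) / len(instance)
```

For $I_4 = \{1,3\}$ and half of the variables set to 1, a random constraint is satisfied with probability $(4 + 4)/16 = 1/2$, so `fraction` should be close to 0.5.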

Definition 1. A $p$-valuation, for some integer $0 \le p \le n$, is a valuation $\sigma$ such that $|\{x_i \mid \sigma(x_i) = 1\}| = p$.

Let $\delta \in [0,1]$. For the sake of simplicity, whenever it is not ambiguous we will call a $\lfloor\delta n\rfloor$-valuation a $\delta$-valuation. A $\delta$-valuation that is a solution of an instance $I_k(m,n)$ is said to be a $\delta$-solution. Let $X_\delta$ be the random variable associating to each $I_k(m,n)$ the number of its $\delta$-solutions.



Alternatively, this class can be seen as a hypergraph bi-coloring problem where the number of vertices allowed to have a certain color in an edge is taken only in $I_k$.
Theorem 1. For any CSP($I_k$), there exist $r^*_\delta > 0$ and $0 < \delta < 1$ such that for all $r < r^*_\delta$,

$$\frac{E[X_\delta]^2}{E[X_\delta^2]} \ge C$$

for some constant $C > 0$ independent of $n$.

Roughly speaking, the preceding theorem states that for any problem in CSP($I_k$), there exists a $\delta$ that makes the second moment method succeed in computing a lower bound. Combined with the Cauchy-Schwarz inequality

$$\Pr(X_\delta > 0) \ge \frac{E[X_\delta]^2}{E[X_\delta^2]}$$

and the sharpness of the threshold of the problems in this class [13], [4], we have the following consequence of Theorem 1.

Corollary 1. For any CSP($I_k$), there exists a real $r^*_\delta > 0$ such that for $r < r^*_\delta$, $\lim_{n\to\infty} \Pr(I_k(rn, n) \text{ is satisfiable}) = 1$.

Theorem 1 states that the second moment method succeeds for some values of $\delta$ using $X_\delta$ as a random variable. However, the value of the bound $r^*_\delta$ mentioned in the theorem (and given in Section 2.2) is not the optimal bound that can be derived by the method. This is the price of its generality. The full analysis establishing the optimal bound has to be done on a case-by-case basis. Note that the best bound that can be obtained cannot exceed the smallest ratio that makes $E[X_\delta] \to 0$ when $n \to \infty$. Indeed, by Markov's inequality the probability that $X_\delta > 0$ then tends to 0. The value of the best bound that can be expected for positive 1-in-k-SAT is asymptotically $\log k/k$. This is precisely what we obtain through the full analysis of this particular problem. We obtain:

Theorem 2. Let $I_k = \{1\}$. Then $\lim_{n\to\infty} \Pr(I_k(rn, n) \text{ is satisfiable}) = 1$ if $r < \log k/k$ and $k \ge 3$.

Specifically, for $k = 3$ a better lower bound of 0.546 has been computed in [15] by analyzing the success of a specific algorithm in finding a solution. However, our aim is to provide a tool that systematically yields a lower bound for a large class of CSPs. We give a rough upper bound of $(\log k)^2/k$ (valid for $k \ge 7$), which shows that our bounds are tight around the threshold.

2. Second moment and characteristic solutions 
We first define the characteristic solutions before we give the second moment of their number. 

Definition 2. Characteristic valuations for some CSP($I_k$) are $\delta$-valuations for which the probability of satisfying a uniformly randomly drawn constraint is locally maximum with respect to $\delta$. The solutions of an instance that are characteristic valuations are said to be characteristic solutions for this instance.

Given some $\delta$-valuation, the probability $\pi_i(\delta)$ that a randomly selected $k$-tuple contains $i$ ones is $\pi_i(\delta) = \binom{k}{i}\delta^i(1-\delta)^{k-i}$. So the probability that a $\delta$-valuation satisfies, with respect to some set $I_k \subset \{1,2,\ldots,k-1\}$, a randomly selected $k$-tuple is obtained by summing up the mutually exclusive cases for the different $i$'s in $I_k$. This probability is

$$g_{I_k}(\delta) = \sum_{i \in I_k} \pi_i(\delta) = \sum_{i \in I_k} \binom{k}{i}\delta^i(1-\delta)^{k-i}.$$

Let $\Delta_{I_k}$ be the set of reals for which $g_{I_k}(\delta)$ is locally maximum. Clearly, for any $\delta \in \Delta_{I_k}$, $\delta$-valuations are by definition the characteristic valuations of CSP($I_k$).

Since $I_k \subset \{1,\ldots,k-1\}$, we have $g_{I_k}(0) = g_{I_k}(1) = 0$. The function $g_{I_k}(\delta)$ being smooth and strictly positive inside $]0,1[$, it attains its maximum inside the interval $]0,1[$ at at least one stationary point. Thus for any $I_k \subset \{1,\ldots,k-1\}$, $\Delta_{I_k} \neq \emptyset$. The fact that every $\delta \in \Delta_{I_k}$ is a stationary point of $g_{I_k}(\delta)$ will be used later.
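The characteristic values of $\delta$ can be located numerically. The sketch below evaluates $g_{I_k}$ on a uniform grid and reports the grid points that are local maxima; the grid search and the function names are illustrative choices, not the method of the paper, and the returned values are only accurate up to the grid step.

```python
from math import comb

def g(delta, k, allowed):
    """Probability that a delta-valuation satisfies a random constraint of CSP(I_k)."""
    return sum(comb(k, i) * delta**i * (1 - delta)**(k - i) for i in allowed)

def characteristic_deltas(k, allowed, grid=2000):
    """Grid points in ]0,1[ where g is a local maximum (approximation of the set of
    characteristic values, up to the grid step 1/grid)."""
    xs = [j / grid for j in range(grid + 1)]
    ys = [g(x, k, allowed) for x in xs]
    return [xs[j] for j in range(1, grid) if ys[j - 1] < ys[j] >= ys[j + 1]]

one_in_five = characteristic_deltas(5, {1})      # single maximum, at 1/k
three_peaks = characteristic_deltas(13, {1, 8, 12})
```

For positive 1-in-k-SAT the only local maximum is at $\delta = 1/k$, while a set such as $\{1, 8, 12\}$ with $k = 13$ produces several characteristic values.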
2.1. First and second moment of number of characteristic solutions. The key idea used in the method presented in this paper is, instead of taking as a random variable the number of solutions, to consider as random variable the number of $\delta$-solutions. The first moment of $X_\delta$ is:

$$E[X_\delta] = \binom{n}{\lfloor\delta n\rfloor}\, g_{I_k}(\delta)^{rn} \sim \frac{1}{\sqrt{2\pi\delta(1-\delta)n}}\;\gamma_{I_k,r}(\delta)^n$$

where:

$$\gamma_{I_k,r}(\delta) = \frac{g_{I_k}(\delta)^r}{\delta^\delta(1-\delta)^{1-\delta}}.$$



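Remark 1. It is easy to see that $\lim_{n\to\infty}\gamma_{I_k,r}(\delta)^n = 0$ for any $r > r_{I_k,\delta} = \frac{\delta\log\delta + (1-\delta)\log(1-\delta)}{\log g_{I_k}(\delta)}$. By Markov's inequality, this means that w.h.p. $\delta$-solutions do not exist for $r > r_{I_k,\delta}$, and hence the lower bound that we can get through $\delta$-solutions is at most $r_{I_k,\delta}$.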
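The ratio above which the first moment of the number of $\delta$-solutions vanishes is the one at which the entropy term exactly compensates $g_{I_k}(\delta)^r$. The sketch below computes it for positive 1-in-k-SAT at $\delta = 1/k$ and compares it with $\log k/k$; natural logarithms are assumed and the function names are illustrative.

```python
from math import log

def g_one_in_k(delta, k):
    """g for positive 1-in-k-SAT: exactly one of the k variables is set to 1."""
    return k * delta * (1 - delta) ** (k - 1)

def first_moment_threshold(delta, g_value):
    """Ratio r at which g^r / (delta^delta (1-delta)^(1-delta)) equals 1."""
    entropy = -(delta * log(delta) + (1 - delta) * log(1 - delta))
    return entropy / -log(g_value)

def ratio_to_log_k_over_k(k):
    delta = 1.0 / k
    return first_moment_threshold(delta, g_one_in_k(delta, k)) / (log(k) / k)

ratios = [ratio_to_log_k_over_k(k) for k in (3, 10, 100, 1000)]
```

The values in `ratios` decrease slowly towards 1 as $k$ grows, in line with an asymptotic first moment limit of $\log k/k$ for $1/k$-solutions.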

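For computing the second moment, we consider two $\delta$-valuations $\sigma_1$ and $\sigma_2$ having $p$ variables assigned 1 in $\sigma_1$ and 0 in $\sigma_2$. This defines all the other categories of variables. Indeed, the number of variables assigned 0 in $\sigma_1$ and 1 in $\sigma_2$ must also be $p$ in order that $\sigma_2$ is a $\delta$-valuation; $\lfloor\delta n\rfloor - p$ is the number of variables assigned 1 in both valuations, and $n - \lfloor\delta n\rfloor - p$ are assigned 0 in both valuations. First, we give the probability $\phi_{i,j,\delta}\left(\frac{p}{\delta n}\right)$ that a random $k$-tuple has $i$ 1s under $\sigma_1$ and $j$ 1s under $\sigma_2$, where $d$ denotes the number of variables of the $k$-tuple equal to 1 in both assignments; $d$ must range from $d_{min} = \max(0, i+j-k)$ to $d_{max} = \min(i,j)$:

$$\phi_{i,j,\delta}\left(\frac{p}{\delta n}\right) = \sum_{d=d_{min}}^{d_{max}} \binom{k}{i}\binom{i}{d}\binom{k-i}{j-d} \left(\frac{\lfloor\delta n\rfloor - p}{n}\right)^{d} \left(\frac{p}{n}\right)^{i+j-2d} \left(\frac{n - \lfloor\delta n\rfloor - p}{n}\right)^{k-i-j+d} \qquad (1)$$

Summing up over couples $(i,j)$, we get $G_{I_k,\delta}(\mu)$, the probability that a couple of $\delta$-valuations having $p = \mu\delta n$ variables assigned 1 in $\sigma_1$ and 0 in $\sigma_2$ satisfies a random constraint:

$$G_{I_k,\delta}(\mu) = \sum_{i\in I_k}\sum_{j\in I_k}\phi_{i,j,\delta}(\mu).$$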

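We can now write the second moment by summing up over all possible couples $(\sigma_1, \sigma_2)$:

$$E[X_\delta^2] = \sum_{p=0}^{\min(\lfloor\delta n\rfloor,\, n-\lfloor\delta n\rfloor)} \binom{n}{\lfloor\delta n\rfloor - p,\; p,\; p,\; n - \lfloor\delta n\rfloor - p}\; G_{I_k,\delta}\!\left(\frac{p}{\delta n}\right)^{rn}$$

We now estimate $E[X_\delta^2]$ as a function of $n$, using a classical asymptotic estimate of the multinomial coefficient. For small multinomial numbers the asymptotic estimate being also an upper bound, it will be sufficient for the estimation we need. We set $\mu = \frac{p}{\delta n}$:

$$\binom{n}{(1-\mu)\delta n,\; \mu\delta n,\; \mu\delta n,\; (1-\delta-\mu\delta)n} \sim \frac{(2\pi n)^{-3/2}}{\sqrt{\mu^2\delta^3(1-\mu)(1-\delta-\mu\delta)}}\; t_\delta(\mu)^{-n}$$

where $t_\delta(\mu) = ((1-\mu)\delta)^{(1-\mu)\delta}\,(\mu\delta)^{2\mu\delta}\,(1-\delta-\mu\delta)^{1-\delta-\mu\delta}$. We have:

$$E[X_\delta^2] \le (2\pi n)^{-3/2} \sum_{p=0}^{\min(\lfloor\delta n\rfloor,\, n-\lfloor\delta n\rfloor)} \frac{\Gamma_{I_k,\delta,r}(\mu)^n}{\sqrt{\mu^2\delta^3(1-\mu)(1-\delta-\mu\delta)}} \qquad (2)$$

with:

$$\Gamma_{I_k,\delta,r}(\mu) = \frac{G_{I_k,\delta}(\mu)^r}{t_\delta(\mu)} \qquad (3)$$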
Each term in (2) consisting of a polynomial factor and an exponential factor in $n$, the sum can be estimated with a discrete version of the Laplace method. Thus:

Lemma 1. If $u$ and $v$ are smooth real-valued functions of one variable $x$, if $v$ has a single maximum on $[a, +\infty[$, located at $\xi_0$ with $\xi_0 > a$, and if further $v''(\xi_0)$ is not 0, then:

$$\sum_{p \ge an} u\!\left(\frac{p}{n}\right)\exp\!\left(n\,v\!\left(\frac{p}{n}\right)\right) \sim u(\xi_0)\sqrt{\frac{2\pi n}{|v''(\xi_0)|}}\,\exp\left(n\,v(\xi_0)\right)$$
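The discrete Laplace estimate can be checked numerically on a toy example. The sketch below uses the standard form of the estimate, namely that the sum of $u(p/n)\exp(n\,v(p/n))$ behaves like $u(\xi_0)\sqrt{2\pi n/|v''(\xi_0)|}\exp(n\,v(\xi_0))$; the functions $u$ and $v$ are arbitrary illustrative choices, unrelated to the functions of the paper, with a single maximum of $v$ at $\xi_0 = 1/2$ on $[0,1]$.

```python
from math import exp, pi, sqrt

def u(x):
    """Smooth prefactor (illustrative choice)."""
    return 1.0 + x

def v(x):
    """Exponent with a single maximum on [0, 1], at x = 1/2, where v'' = -2."""
    t = x - 0.5
    return -t * t - t ** 3

def discrete_sum(n):
    """Left-hand side: sum over p = 0..n of u(p/n) exp(n v(p/n))."""
    return sum(u(p / n) * exp(n * v(p / n)) for p in range(n + 1))

def laplace_estimate(n, xi0=0.5, v_second=-2.0):
    """Right-hand side of the estimate, evaluated at the maximizer xi0."""
    return u(xi0) * sqrt(2 * pi * n / abs(v_second)) * exp(n * v(xi0))

ratio = discrete_sum(4000) / laplace_estimate(4000)
```

The ratio approaches 1 as $n$ grows, with a relative error of order $1/n$ for this choice of $u$ and $v$.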



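We will apply Lemma 1 setting $v(\mu) = \log\Gamma_{I_k,\delta,r}(\mu)$ and $u(\mu) = \frac{1}{\sqrt{\mu^2\delta^3(1-\mu)(1-\delta-\mu\delta)}}$. Then there is a factor $K > 0$, independent of $n$, such that

$$E[X_\delta^2] \le K\, n^{-1} \left(\max_{\mu\in[0,\min(1,\frac{1-\delta}{\delta})]}\Gamma_{I_k,\delta,r}(\mu)\right)^n \qquad (4)$$

The success of the second moment method relies mainly on the behavior of $\Gamma_{I_k,\delta,r}(\mu)$ for $\mu \in [0, \min(1, \frac{1-\delta}{\delta})]$. The upper bound of the ratio $\frac{E[X_\delta^2]}{E[X_\delta]^2}$ will then depend mainly on its exponential part $\left(\frac{\max_\mu \Gamma_{I_k,\delta,r}(\mu)}{\gamma_{I_k,r}(\delta)^2}\right)^n$, whose base must be equal to 1; otherwise, all we get is the trivial relation $\Pr(X_\delta > 0) \ge 0$. We will see in the next section that this is achieved through characteristic solutions.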
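From (1) we will write in the sequel of the paper:

$$\phi_{i,j,\delta}(\mu) = \sum_{d=\max(0,i+j-k)}^{\min(i,j)} \binom{k}{i}\binom{i}{d}\binom{k-i}{j-d}\,\kappa_{i,j,d,\delta}(\mu) \qquad (5)$$

with:

$$\kappa_{i,j,d,\delta}(\mu) = ((1-\mu)\delta)^{d}\,(\mu\delta)^{i+j-2d}\,(1-\delta-\mu\delta)^{k-i-j+d} \qquad (6)$$

and:

$$G_{I_k,\delta}(\mu) = \sum_{i\in I_k}\sum_{j\in I_k}\phi_{i,j,\delta}(\mu) \qquad (7)$$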
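The pair probabilities can be evaluated directly from their definitions: $\kappa$ is the product of the three powers $((1-\mu)\delta)^d$, $(\mu\delta)^{i+j-2d}$ and $(1-\delta-\mu\delta)^{k-i-j+d}$, $\phi$ sums $\kappa$ over $d$ with the three binomial weights, and $G$ sums $\phi$ over pairs $(i,j)$ in $I_k$. The following is a minimal sketch with illustrative function names; it also evaluates $G$ at $\mu = 1-\delta$, where the two valuations behave as if drawn independently.

```python
from math import comb

def kappa(i, j, d, delta, mu, k):
    """Probability weight of one category pattern: d variables are 1 in both valuations,
    i + j - 2d differ, and k - i - j + d are 0 in both."""
    return (((1 - mu) * delta) ** d
            * (mu * delta) ** (i + j - 2 * d)
            * (1 - delta - mu * delta) ** (k - i - j + d))

def phi(i, j, delta, mu, k):
    """Probability that a random k-tuple has i ones under sigma_1 and j under sigma_2."""
    return sum(comb(k, i) * comb(i, d) * comb(k - i, j - d) * kappa(i, j, d, delta, mu, k)
               for d in range(max(0, i + j - k), min(i, j) + 1))

def G(delta, mu, k, allowed):
    """Probability that both valuations of the pair satisfy a random constraint."""
    return sum(phi(i, j, delta, mu, k) for i in allowed for j in allowed)

def g(delta, k, allowed):
    """Probability that a single delta-valuation satisfies a random constraint."""
    return sum(comb(k, i) * delta**i * (1 - delta)**(k - i) for i in allowed)

k, allowed, delta = 4, {1, 3}, 0.3
independent_value = G(delta, 1 - delta, k, allowed)   # expected to match g(delta)**2
total_mass = sum(phi(i, j, delta, 0.5, k) for i in range(k + 1) for j in range(k + 1))
```

Summing `phi` over all pairs $(i,j)$ with $0 \le i,j \le k$ gives 1, since the four category probabilities add up to 1, and `independent_value` coincides with `g(delta, k, allowed) ** 2`.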

2.2. Proof of Theorem 1 and its Corollary. In the following, we first sketch the proof by discussing its most important ingredients. A crucial point for the success of the method is the point where $\mu = 1-\delta$, or the independence point. To understand this, consider two valuations drawn independently uniformly at random from the set of $\delta$-valuations. A variable is assigned 1 under one of the two $\delta$-valuations with probability $\delta$ and 0 with probability $1-\delta$. Since the two valuations are selected independently, the probability of being assigned 1 by one $\delta$-valuation and 0 by the other is $\delta(1-\delta)$. Thus, according to the notation of Section 2.1, these pairs of $\delta$-valuations are characterized by Hamming distance $2\mu\delta n$ with $\mu = 1-\delta$. These uncorrelated pairs of $\delta$-valuations play a central role in the success of the method. Indeed:
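$$G_{I_k,\delta}(1-\delta) = \sum_{i\in I_k}\sum_{j\in I_k}\phi_{i,j,\delta}(1-\delta) = \sum_{i\in I_k}\sum_{j\in I_k}\binom{k}{i}\binom{k}{j}\delta^{i+j}(1-\delta)^{2k-i-j} = \left(\sum_{i\in I_k}\binom{k}{i}\delta^i(1-\delta)^{k-i}\right)^2 = g_{I_k}(\delta)^2$$

It is easy to see that $t_\delta(1-\delta) = \left(\delta^\delta(1-\delta)^{1-\delta}\right)^2$, thus:

$$\Gamma_{I_k,\delta,r}(1-\delta) = \frac{G_{I_k,\delta}(1-\delta)^r}{t_\delta(1-\delta)} = \left(\frac{g_{I_k}(\delta)^r}{\delta^\delta(1-\delta)^{1-\delta}}\right)^2 = \gamma_{I_k,r}(\delta)^2 \qquad (8)$$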

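Since $\Gamma_{I_k,\delta,r}(1-\delta)/\gamma_{I_k,r}(\delta)^2 = 1$, if $\mu = 1-\delta$ is not the global maximum of $\Gamma_{I_k,\delta,r}(\mu)$, there exists some $\mu$ for which $\gamma_{I_k,r}(\delta)^2/\Gamma_{I_k,\delta,r}(\mu) < 1$, making the method fail.

Consequently, a necessary condition for the success of the method is that $\mu = 1-\delta$ is a stationary point. Lemma 2 states that this is the case only for the characteristic solutions.

Let $\rho = \min_{\mu\in[0,\min(1,\frac{1-\delta}{\delta})]}\left(\frac{\delta}{1-\mu} + \frac{2\delta}{\mu} + \frac{\delta^2}{1-\delta-\delta\mu}\right)$ and $\nu = \max_{\mu\in[0,\min(1,\frac{1-\delta}{\delta})]}\left(\log G_{I_k,\delta}(\mu)\right)''$. Let:

$$r^*_\delta = \frac{\rho}{\nu} \qquad (9)$$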

Lemma 3 states that $r^*_\delta$ is well defined and strictly positive, and that for any $r < r^*_\delta$ the second derivative of $\log\Gamma_{I_k,\delta,r}(\mu)$ is negative for any $\mu \in [0, \min(1, (1-\delta)/\delta)]$. This function is then concave in this interval. Combining the two lemmas, we can conclude that there is a range of ratios $]0, r^*_\delta[$ for which $\mu = 1-\delta$ is the global maximum of $\Gamma_{I_k,\delta,r}$.

Remark 2. In general, the point $\mu = 1-\delta$ continues to be the global maximum in a range beyond $r^*_\delta$ after the function ceases to be concave, allowing a more precise analysis to give a better lower bound than $r^*_\delta$. However, a general bound beyond concavity is hard to figure out for the class, and we do not need this fact for the proof of Theorem 1, whose aim is to give the conditions under which the second moment method succeeds regardless of the value of the bound obtained. When one needs to compute the best lower bound with respect to $\delta$-solutions for a particular problem, a finer analysis is required for this particular problem. This is what we do to get the best possible lower bound with respect to $\delta$-solutions for positive 1-in-k-SAT.

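Lemma 2. $\Gamma'_{I_k,\delta,r}(1-\delta) = 0$ iff $\delta \in \Delta_{I_k}$.

Proof. Considering (3), it is easy to check that the derivative of $t_\delta(\mu)$ (defined in Section 2.1) is such that $t'_\delta(1-\delta) = 0$. It is then necessary and sufficient that $G'_{I_k,\delta}(1-\delta) = 0$. It can be shown (see Appendix A) that:

$$G'_{I_k,\delta}(1-\delta) = -\frac{\delta}{k}\left(g'_{I_k}(\delta)\right)^2 \qquad (10)$$

which is equal to 0 iff $g'_{I_k}(\delta) = 0$, i.e. $\delta \in \Delta_{I_k}$. □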
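Lemma 3. $r^*_\delta$ (as defined in (9)) is strictly greater than 0, and for every $r < r^*_\delta$, $\left(\log\Gamma_{I_k,\delta,r}(\mu)\right)'' < 0$ for $\mu \in [0, \min(1, (1-\delta)/\delta)]$.

Proof. The second derivative of $\log\Gamma_{I_k,\delta,r}(\mu)$ is:

$$\left(\log\Gamma_{I_k,\delta,r}(\mu)\right)'' = -\left(\frac{\delta}{1-\mu} + \frac{2\delta}{\mu} + \frac{\delta^2}{1-\delta-\delta\mu}\right) + r\left(\log G_{I_k,\delta}(\mu)\right)''$$

It can be shown (see Appendix B) that $-\left(\frac{\delta}{1-\mu} + \frac{2\delta}{\mu} + \frac{\delta^2}{1-\delta-\delta\mu}\right)$ is negative and bounded from above by $-\rho$, and that $\left(\log G_{I_k,\delta}(\mu)\right)''$ is bounded from above by $\nu$, which is positive. Then $\left(\log\Gamma_{I_k,\delta,r}(\mu)\right)'' \le -\rho + r\nu < 0$ if $r < \rho/\nu = r^*_\delta$. □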

Now we are in a position to give the proof of Theorem 1.

Proof of Theorem 1. Thanks to Lemma 2 we know that $\mu = 1-\delta$ is a stationary point of $\log\Gamma_{I_k,\delta,r}(\mu)$, and thanks to Lemma 3 we know that $\log\Gamma_{I_k,\delta,r}(\mu)$ is concave for $r < r^*_\delta$. Combining these two facts, we deduce that $\mu = 1-\delta$ is a global maximum of $\log\Gamma_{I_k,\delta,r}(\mu)$. The maximum in the inequality (4) is then attained at $\mu = 1-\delta$. Denoting by $C_1$ the constant factor of (4) and using (8), $\Gamma_{I_k,\delta,r}(1-\delta) = \gamma_{I_k,r}(\delta)^2$, this yields $E[X_\delta^2] \le n^{-1} C_1\, \gamma_{I_k,r}(\delta)^{2n}$. Allowing for the relation $E[X_\delta] \ge \frac{c}{\sqrt{2\pi\delta(1-\delta)n}}\,\gamma_{I_k,r}(\delta)^n$ for some constant $c > 0$, we deduce:

$$\frac{E[X_\delta]^2}{E[X_\delta^2]} \ge C_2$$

$C_2$ being a positive constant. Thus Theorem 1 is proved.

From the Creignou and Daudé criterion [5], [6], for $k \ge 3$ a CSP($I_k$) is neither depending on one component nor strongly depending on a 2XOR-relation. According to Friedgut's theorem [12], the following fact can then be stated:

Fact 1. For every $k \ge 3$ and a random CSP($I_k$), there exists a function $\lambda_k(n)$ such that for any $\epsilon > 0$:

$$\lim_{n\to\infty}\Pr(I_k(rn, n) \text{ is satisfiable}) = \begin{cases} 1 & \text{if } r < \lambda_k(n)(1-\epsilon) \\ 0 & \text{if } r > \lambda_k(n)(1+\epsilon) \end{cases}$$

It follows that for any CSP($I_k$), if $r < r^*_\delta$ as defined in (9), then $\lim_{n\to\infty}\Pr(I_k(rn, n) \text{ is satisfiable}) = 1$. Thus Corollary 1 is proved.



3. Positive 1-in-k-SAT case: proof of Theorem 2

For 1-in-k-SAT, we denote the corresponding $I_k$ by $1_k = \{1\}$. The function $g_{1_k}(\delta) = k\delta(1-\delta)^{k-1}$. It is easy to check that $\Delta_{1_k} = \{1/k\}$. We note first that the best lower bound that we can hope to get is $r_{1_k,1/k}$ as defined in Remark 1. It is easy to check that $\lim_{k\to\infty}\frac{r_{1_k,1/k}}{\log k/k} = 1$. Then, asymptotically, the best lower bound that can be obtained with respect to $1/k$-solutions is $\log k/k$.

For the second moment, as previously, we consider only $\delta$-valuations. Only the function $G_{1_k,1/k}(\mu)$ changes:

$$G_{1_k,1/k}(\mu) = k(k-1)\,\kappa_{1,1,0,1/k}(\mu) + k\,\kappa_{1,1,1,1/k}(\mu) = k^{1-k}(k-1-\mu)^{k-2}\left(k(1-\mu+\mu^2)-1\right)$$

Thanks to Lemma 2 we know that $\mu = 1-1/k$ is a stationary point of $\Gamma_{1_k,1/k,r}(\mu)$ and that $\Gamma_{1_k,1/k,r}(1-1/k) = \gamma_{1_k,r}(1/k)^2$. It remains to prove that it is a global maximum for $r < \log k/k$. This bound goes in general beyond concavity, so we need a finer analysis. That is the purpose of the following lemma.

Lemma 1. For any $\mu \in [0,1]$, $\Gamma_{1_k,\frac{1}{k},\frac{\log k}{k}}(\mu) \le \gamma_{1_k,\frac{\log k}{k}}(1/k)^2$.

Proof. We give here just an outline of the proof. A detailed one is given in Appendix C. The interval of $\mu$ is divided into two parts, $[0,1/2]$ and $[1/2,1]$, where the function $\Gamma_{1_k,\frac{1}{k},\frac{\log k}{k}}$ is bounded from above using two different techniques.

First, for $\mu \in [0,1/2]$: in this interval, we use mainly the fact that for some $a \in [0,1/2]$, $1-\mu+\mu^2 \le l_a = 1-a+a^2$ for any $\mu \in [a,1/2]$. Let

$$\tau_a(\mu) = \log k\,\log\left(k^{1-k}(k-1-\mu)^{k-2}(k\,l_a-1)\right) - k\log t_{1/k}(\mu)$$

$\tau_a(\mu)$ bounds from above $k\log\Gamma_{1_k,\frac{1}{k},\frac{\log k}{k}}(\mu)$ in the interval $[a,1/2]$. It can be shown that $\tau_a(\mu)$ is strictly increasing in the above interval. Beginning with $a_0 = 0$, we find a value $a_1$ such that $\tau_{a_0}(a_1) < 2k\log\gamma_{1_k,\frac{\log k}{k}}(1/k)$, proving the desired inequality for $\mu \in [a_0,a_1]$. We repeat the same with $\tau_{a_1}$ and find an $a_2$, and so on, until an $a_i \ge 1/2$, which finishes this part of the proof. In fact, only two steps are sufficient, with $a_1 = 0.15$.

Second, for $\mu \in [1/2,1]$: recall that $k\log\Gamma_{1_k,\frac{1}{k},\frac{\log k}{k}}(\mu) = \log k\,\log G_{1_k,\frac{1}{k}}(\mu) - k\log t_{1/k}(\mu)$. We first prove separately that the derivatives of $\log G_{1_k,\frac{1}{k}}(\mu)$ and $-\log t_{1/k}(\mu)$ are concave in the whole considered interval. Then we split the above interval in two parts, $]1/2, 1-1/k[$ and $]1-1/k, 1]$. Considering their concavity, both derivatives can be bounded from below in the first interval by the linear functions representing the chords joining the two points corresponding to the two bounds of the interval. The sum of these two linear functions being positive, this proves that the derivative of $k\log\Gamma_{1_k,\frac{1}{k},\frac{\log k}{k}}(\mu)$ is positive in the first interval, and then that the value of the function at $\mu = 1-1/k$ is maximum. For the second interval, the derivatives are bounded from above by the linear functions representing the tangent lines at $\mu = 1-1/k$. The sum of these two linear functions being negative, the derivative of $k\log\Gamma_{1_k,\frac{1}{k},\frac{\log k}{k}}(\mu)$ is negative in the second interval, and then $\mu = 1-1/k$ is also the maximum in the second interval. Summing up, $\Gamma_{1_k,\frac{1}{k},\frac{\log k}{k}}(1-1/k) = \gamma_{1_k,\frac{\log k}{k}}(1/k)^2$ is the maximum of $\Gamma_{1_k,\frac{1}{k},\frac{\log k}{k}}(\mu)$ within $[1/2,1]$. □
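The statement of the lemma can be probed numerically for small $k$. The sketch below works with logarithms, uses the closed form $k^{1-k}(k-1-\mu)^{k-2}(k(1-\mu+\mu^2)-1)$ of $G_{1_k,1/k}(\mu)$ given in this section, and scans $\mu$ over a uniform grid of $[0,1]$ at ratio $r = \log k/k$. This is only a sanity check on a grid, not a proof; function names are illustrative and natural logarithms are assumed.

```python
from math import log

def xlogx(x):
    """x log x, extended by continuity with value 0 at x = 0."""
    return x * log(x) if x > 0 else 0.0

def log_t(mu, delta):
    """log of t_delta(mu), the entropy-like factor of the multinomial estimate."""
    return (xlogx((1 - mu) * delta) + 2 * xlogx(mu * delta)
            + xlogx(1 - delta - mu * delta))

def log_G_one_in_k(mu, k):
    """log of the closed form of G for positive 1-in-k-SAT at delta = 1/k."""
    return ((1 - k) * log(k) + (k - 2) * log(k - 1 - mu)
            + log(k * (1 - mu + mu * mu) - 1))

def log_Gamma(mu, k, r):
    return r * log_G_one_in_k(mu, k) - log_t(mu, 1.0 / k)

def log_gamma(k, r):
    """log of gamma at delta = 1/k, where g = (1 - 1/k)^(k-1)."""
    delta = 1.0 / k
    return r * (k - 1) * log(1 - delta) - xlogx(delta) - xlogx(1 - delta)

def scan(k, grid=1000):
    """Return (argmax of log Gamma on the grid, its value, 2 log gamma(1/k))."""
    r = log(k) / k
    best_mu = max((j / grid for j in range(grid + 1)), key=lambda m: log_Gamma(m, k, r))
    return best_mu, log_Gamma(best_mu, k, r), 2 * log_gamma(k, r)
```

For instance, `scan(5)` returns a maximizer at $\mu = 0.8 = 1 - 1/k$ whose value coincides with $2\log\gamma_{1_k,r}(1/k)$ up to rounding.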

3.1. A general upper bound for positive 1-in-k-SAT. $X$ is the random variable associating to each $1_k(m,n)$ the number of its solutions. We have:

$$E[X] = \sum_{p=0}^{n}\binom{n}{p}\left(k\,\frac{p}{n}\left(1-\frac{p}{n}\right)^{k-1}\right)^{rn} \le \left(\max_{\delta\in[0,1]}\gamma_{1_k,r}(\delta)\right)^n \mathrm{poly}(n)$$

For the upper bound, we prove that for $k \ge 7$ and $r = \log^2 k/k$, $\max_{\delta\in[0,1]}\gamma_{1_k,r}(\delta) < 1$. This is the purpose of the following fact.

Fact 2. For $k \ge 7$, $\max_{\delta\in[0,1]}\gamma_{1_k,\log^2 k/k}(\delta) < 1$.

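Proof. Throughout, $r = \log^2 k/k$. We prove it first in the interval $\delta \in [1/2,1]$. Both $g_{1_k}(\delta)^r$ and $\frac{1}{\delta^\delta(1-\delta)^{1-\delta}}$ are decreasing in $\delta$ in this interval. Since $\gamma_{1_k,r}(1/2) = 2\,g_{1_k}(1/2)^r = 2\left(\frac{k}{2^k}\right)^{\log^2 k/k} < 1$, we get $\gamma_{1_k,\log^2 k/k}(\delta) < 1$ in the interval $\delta \in [1/2,1]$.

$g_{1_k}(\delta) = k\delta(1-\delta)^{k-1}$ increases from 0 until $\delta = 1/k$. In the same interval $\delta^\delta(1-\delta)^{1-\delta}$ is decreasing. Then

$$\gamma_{1_k,\log^2 k/k}(\delta) = \frac{g_{1_k}(\delta)^{\log^2 k/k}}{\delta^\delta(1-\delta)^{1-\delta}} \le \frac{g_{1_k}(1/k)^{\log^2 k/k}}{(1-1/k)^{1-1/k}(1/k)^{1/k}} < 1 \quad \text{for } k \ge 7.$$

It remains to handle the function within the interval $[1/k,1/2]$. Since $\log\frac{1}{\delta^\delta(1-\delta)^{1-\delta}}$ is concave, it can be bounded from above by its tangent line at $\delta = 1/k$, whose slope is its derivative at that point:

$$\frac{1}{\delta^\delta(1-\delta)^{1-\delta}} \le (k-1)^{-1+\delta}\,k.$$

Since $g_{1_k}$ is decreasing on $[1/k,1/2]$, $(k-1)^{-1+\delta}\,k\,g_{1_k}(\delta)^{\log^2 k/k} \le (k-1)^{-1+s}\,k\,g_{1_k}(1/k)^{\log^2 k/k}$, which does not exceed 1, within $[1/k,s]$, where $s = -\log(1-1/k)\,\frac{(k-1)\log^2 k - k}{k\log(k-1)}$. Finally, we bound from above $\log g_{1_k}(\delta)$ by $(\log g_{1_k})'(s)\,(\delta-s) + \log g_{1_k}(s)$. The resulting upper bound is less than 1 for $k \ge 7$ in $[s,1/2]$. □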

The application of Markov inequality finishes the proof of the upper bound. 
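The first moment condition behind this upper bound can be checked numerically. The sketch below evaluates the logarithm of $g_{1_k}(\delta)^r/(\delta^\delta(1-\delta)^{1-\delta})$ with $g_{1_k}(\delta) = k\delta(1-\delta)^{k-1}$ and $r = \log^2 k/k$ on a uniform grid of $]0,1[$, and returns its maximum. It is a grid-based sanity check with illustrative names, assuming natural logarithms, and not a substitute for the proof.

```python
from math import log

def log_gamma_one_in_k(delta, k, r):
    """log of g(delta)^r / (delta^delta (1-delta)^(1-delta)) for positive 1-in-k-SAT."""
    log_g = log(k) + log(delta) + (k - 1) * log(1 - delta)
    entropy = -(delta * log(delta) + (1 - delta) * log(1 - delta))
    return r * log_g + entropy

def max_log_gamma(k, grid=10000):
    """Maximum over the grid points of ]0,1[ at ratio r = log(k)^2 / k."""
    r = log(k) ** 2 / k
    return max(log_gamma_one_in_k(j / grid, k, r) for j in range(1, grid))

margins = {k: max_log_gamma(k) for k in range(7, 41)}
```

Every value in `margins` is negative, so at ratio $\log^2 k/k$ the exponential part of the first moment is below 1 for the tested values of $k$; for $k = 7$ the maximum is reached near $\delta = 0.2$.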

4. Discussion 

As stated by Theorem 1, choosing $\delta$ in $\Delta_{I_k}$ is necessary and sufficient for the second moment method to be successful. An interesting question raised by the fact that $\Delta_{I_k}$ may have many values is: which value gives the best lower bound? Precisely, is there a simple criterion that allows one to select the $\delta \in \Delta_{I_k}$ that gives the best lower bound?

In the example of Figure 1, the function $g_{I_{13}}$ is represented for $I_{13} = \{1, 8, 12\}$. It has three local maxima, and so $\Delta_{I_{13}} = \{\delta_1, \delta_2, \delta_3\}$ with $g_{I_{13}}(\delta_2) < g_{I_{13}}(\delta_1) < g_{I_{13}}(\delta_3)$. As said before, the second moment method succeeds only for those three values of $\delta$. An immediate candidate for the best value could be $\delta_3$, since it is the one for which the probability of satisfying a randomly selected constraint is maximum. In fact, the best lower bound is obtained using $\delta_2$. The latter is the one that maximizes the first moment of $X_\delta$, i.e. that corresponds to $\max_{\delta\in\Delta_{I_k}}\gamma_{I_k,r}(\delta)$. Since $\gamma_{I_k,r}(\delta) = \delta^{-\delta}(1-\delta)^{\delta-1}g_{I_k}(\delta)^r$, the entropy term $\delta^{-\delta}(1-\delta)^{\delta-1}$, centered on 1/2, tends to favor values of $\delta$ near 1/2. We have verified this fact for many problems. We conjecture that for any problem defined by the set $I_k$, the best value of $\delta$ for the second moment method is the $\delta^* \in \Delta_{I_k}$ that maximizes $\gamma_{I_k,r}(\delta)$.
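The selection rule of the conjecture can be sketched in a few lines: locate the local maxima of $g_{I_k}$ on a grid, then keep the one with the largest value of $g_{I_k}(\delta)^r\,\delta^{-\delta}(1-\delta)^{\delta-1}$. The grid search and the names below are illustrative assumptions; only the criterion itself comes from the text.

```python
from math import comb, log

def g(delta, k, allowed):
    return sum(comb(k, i) * delta**i * (1 - delta)**(k - i) for i in allowed)

def log_gamma(delta, k, allowed, r):
    """log of g(delta)^r times the entropy term delta^(-delta) (1-delta)^(delta-1)."""
    entropy = -(delta * log(delta) + (1 - delta) * log(1 - delta))
    return r * log(g(delta, k, allowed)) + entropy

def local_maxima(k, allowed, grid=2000):
    """Grid approximation of the characteristic values (local maxima of g in ]0,1[)."""
    xs = [j / grid for j in range(grid + 1)]
    ys = [g(x, k, allowed) for x in xs]
    return [xs[j] for j in range(1, grid) if ys[j - 1] < ys[j] >= ys[j + 1]]

def best_characteristic_delta(k, allowed, r, grid=2000):
    """Characteristic value that maximizes gamma, as suggested by the conjecture."""
    return max(local_maxima(k, allowed, grid), key=lambda d: log_gamma(d, k, allowed, r))

best = best_characteristic_delta(13, {1, 8, 12}, 0.64)
```

For $I_{13} = \{1, 8, 12\}$ and $r = 0.64$, the selected value is the middle local maximum, the one closest to 1/2, even though $g_{I_{13}}$ is smallest there among the three.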




Figure 1. An example of the functions $g_{I_k}$ and $\gamma_{I_k,r}$ for $I_{13} = \{1, 8, 12\}$ and $r = 0.64$.
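Appendix A. Proof of the equality (10)

Proof. Considering (5) and (7):

$$G'_{I_k,\delta}(\mu) = \sum_{i\in I_k}\sum_{j\in I_k}\sum_{d}\binom{k}{i}\binom{i}{d}\binom{k-i}{j-d}\,\kappa'_{i,j,d,\delta}(\mu)$$

Recall that $\kappa_{i,j,d,\delta}(\mu) = ((1-\mu)\delta)^{d}(\mu\delta)^{i+j-2d}(1-\delta-\mu\delta)^{k-i-j+d}$, and then

$$\kappa'_{i,j,d,\delta}(\mu) = \kappa_{i,j,d,\delta}(\mu)\left(-\frac{d}{1-\mu} + \frac{i+j-2d}{\mu} - \frac{\delta(k-i-j+d)}{1-\delta-\mu\delta}\right)$$

Noting that $\kappa_{i,j,d,\delta}(1-\delta) = \delta^{i+j}(1-\delta)^{2k-i-j}$, we get:

$$\phi'_{i,j,\delta}(1-\delta) = \delta^{i+j}(1-\delta)^{2k-i-j}\binom{k}{i}\sum_{d}\binom{i}{d}\binom{k-i}{j-d}\left(-\frac{d}{\delta} + \frac{i+j-2d}{1-\delta} - \frac{\delta(k-i-j+d)}{(1-\delta)^2}\right)$$

Using the mean of the hypergeometric distribution of parameters $k$, $i$ and $j$ ($\sum_{d} d\binom{i}{d}\binom{k-i}{j-d}/\binom{k}{j} = \frac{ij}{k}$) and the Vandermonde identity, we get:

$$\phi'_{i,j,\delta}(1-\delta) = \delta^{i+j}(1-\delta)^{2k-i-j}\binom{k}{i}\binom{k}{j}\left(-\frac{ij}{k\delta} + \frac{i+j-2ij/k}{1-\delta} - \frac{\delta(k-i-j+ij/k)}{(1-\delta)^2}\right)$$

Denoting by $h_{I_k}(\delta)$ the quantity $\sum_{i\in I_k} i\binom{k}{i}\delta^i(1-\delta)^{k-i}$, we get:

$$G'_{I_k,\delta}(1-\delta) = \sum_{i\in I_k}\sum_{j\in I_k}\phi'_{i,j,\delta}(1-\delta) = -\frac{h_{I_k}(\delta)^2}{k\delta} + \frac{2h_{I_k}(\delta)g_{I_k}(\delta) - 2h_{I_k}(\delta)^2/k}{1-\delta} - \frac{\delta\left(k\,g_{I_k}(\delta)^2 - 2h_{I_k}(\delta)g_{I_k}(\delta) + h_{I_k}(\delta)^2/k\right)}{(1-\delta)^2} = -\frac{\left(h_{I_k}(\delta) - k\delta\,g_{I_k}(\delta)\right)^2}{k\delta(1-\delta)^2}$$

Noting that, because of

$$g'_{I_k}(\delta) = \sum_{i\in I_k}\binom{k}{i}\delta^{i}(1-\delta)^{k-i}\left(\frac{i}{\delta} - \frac{k-i}{1-\delta}\right) = \frac{h_{I_k}(\delta) - k\delta\,g_{I_k}(\delta)}{\delta(1-\delta)},$$

we have $h_{I_k}(\delta) - k\delta\,g_{I_k}(\delta) = \delta(1-\delta)\,g'_{I_k}(\delta)$, which gives the desired relation. □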
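Appendix B. Proof of Lemma 3

Proof. The second derivative of $\log\Gamma_{I_k,\delta,r}(\mu)$ is:

$$\left(\log\Gamma_{I_k,\delta,r}(\mu)\right)'' = -\left(\frac{\delta}{1-\mu}+\frac{2\delta}{\mu}+\frac{\delta^2}{1-\delta-\delta\mu}\right) + r\left(\log G_{I_k,\delta}(\mu)\right)'' = -\left(\frac{\delta}{1-\mu}+\frac{2\delta}{\mu}+\frac{\delta^2}{1-\delta-\delta\mu}\right) + r\,\frac{G_{I_k,\delta}(\mu)\,G''_{I_k,\delta}(\mu) - G'_{I_k,\delta}(\mu)^2}{G_{I_k,\delta}(\mu)^2}$$

It is easy to check that the second derivative of $-\frac{\delta}{1-\mu}-\frac{2\delta}{\mu}-\frac{\delta^2}{1-\delta-\delta\mu}$ is negative and that its derivative tends to $\infty$ when $\mu$ tends to 0 and to $-\infty$ at the other end of the interval. This quantity therefore increases from $-\infty$, attains a maximum at a negative value, then decreases to $-\infty$. Let $-\rho$ ($\rho > 0$) be its maximum value.

In the second part, $G_{I_k,\delta}(\mu)$ is bounded and strictly positive. Indeed, it is formed by a sum of non-negative terms, some of which are strictly positive: all $\kappa_{i,j,d,\delta}(\mu) > 0$ for every $\mu \in\, ]0,\min(1,\frac{1-\delta}{\delta})[$. Moreover, $\kappa_{i,i,i,\delta}(0) > 0$, and if $\delta < 1/2$ then $\kappa_{i,i,0,\delta}(1) > 0$, otherwise $\kappa_{i,i,2i-k,\delta}((1-\delta)/\delta) > 0$.

$G_{I_k,\delta}(\mu)\,G''_{I_k,\delta}(\mu) - G'_{I_k,\delta}(\mu)^2$ is a polynomial in $\mu$. It is also bounded for $\mu \in [0,\min(1,\frac{1-\delta}{\delta})]$. The second part has no singular point and it is bounded. Let $\nu$ be its maximum value. We prove that $\nu > 0$. We know thanks to Lemma 2 that $G'_{I_k,\delta}(1-\delta) = 0$. Moreover the second derivative $G''_{I_k,\delta}(1-\delta) = \frac{\delta^2 g''_{I_k}(\delta)^2}{k(k-1)}$ and, as seen before, $G_{I_k,\delta}(1-\delta) = g_{I_k}(\delta)^2$. We deduce that $\nu > 0$. Indeed,

$$\nu \ge \frac{G_{I_k,\delta}(1-\delta)\,G''_{I_k,\delta}(1-\delta) - G'_{I_k,\delta}(1-\delta)^2}{G_{I_k,\delta}(1-\delta)^2} = \frac{\delta^2 g''_{I_k}(\delta)^2}{k(k-1)\,g_{I_k}(\delta)^2} > 0$$

Finally

$$\left(\log\Gamma_{I_k,\delta,r}(\mu)\right)'' = -\left(\frac{\delta}{1-\mu}+\frac{2\delta}{\mu}+\frac{\delta^2}{1-\delta-\delta\mu}\right) + r\,\frac{G_{I_k,\delta}(\mu)\,G''_{I_k,\delta}(\mu) - G'_{I_k,\delta}(\mu)^2}{G_{I_k,\delta}(\mu)^2} \le -\rho + r\nu$$

The second derivative is then negative over $[0,\min(1,\frac{1-\delta}{\delta})]$ for every $r < r^*_\delta = \rho/\nu$. □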

Appendix C. Detailed proof of Lemma 1 of Section 3

Proof. Throughout this appendix, $r = \log k/k$.

$\mu \in [0,1/2]$: For $\mu \in [0,1]$ the second derivative of $-\log t_{1/k}(\mu)$ is negative and so is the second derivative of $\log(k-1-\mu)$. This allows us to conclude that $\tau'_a(\mu)$ is decreasing. $\tau'_a(1/2) = \log(2k-3) - \frac{2(k-2)\log k}{2k-3} > 0$ for every $k \ge 3$. So $\tau'_a(\mu) > 0$ for $\mu \in [0,1/2]$, and then $\tau_a(\mu)$ is strictly increasing in the same interval.

It is easy to check that $\tau_0(0.15) - 2k\log\gamma_{1_k,r}(1/k) < 0$ for every $k \ge 3$. Consequently $\log\Gamma_{1_k,1/k,r}(\mu) < 2\log\gamma_{1_k,r}(1/k)$ for every $\mu \in [0,0.15]$. Similarly, $\tau_{0.15}(0.5) - 2k\log\gamma_{1_k,r}(1/k) < 0$ for any $k \ge 3$, concluding that $\log\Gamma_{1_k,1/k,r}(\mu) < 2\log\gamma_{1_k,r}(1/k)$ in the interval $[0,\frac{1}{2}]$.

$\mu \in [1/2,1[$: The second derivative of $-k\log t_{1/k}(\mu)$ is $-\left(\frac{1}{1-\mu} + \frac{2}{\mu} + \frac{1}{k-1-\mu}\right)$. It can be checked easily that its third derivative is negative in $[1/2,1]$. Then the first derivative of $-k\log t_{1/k}(\mu)$ is concave. $\log G_{1_k,1/k}(\mu)$ has the same properties.

The value of $(-k\log t_{1/k}(\mu))'$ at the point $\mu = 1/2$ is $\log(2k-3)$. The line joining the points $(1-1/k, 0)$ and $(1/2, \log(2k-3))$ bounds this first derivative from below. So $(-k\log t_{1/k}(\mu))' \ge -\frac{2k\log(2k-3)}{k-2}\,(\mu-1+1/k)$.

Similarly, $(\log k\,\log G_{1_k,1/k}(\mu))' \ge \frac{4k\log k}{2k-3}\,(\mu-1+1/k)$.

Summing up, $(k\log\Gamma_{1_k,1/k,r}(\mu))' \ge \left(\frac{4k\log k}{2k-3} - \frac{2k\log(2k-3)}{k-2}\right)(\mu-1+1/k) > 0$ for $\mu \in [1/2, 1-1/k[$. As a consequence, $\log\Gamma_{1_k,1/k,r}(\mu)$ is increasing in this interval, and then $\log\Gamma_{1_k,1/k,r}(\mu) < \log\Gamma_{1_k,1/k,r}(1-1/k) = 2\log\gamma_{1_k,r}(1/k)$ for $\mu \in [1/2, 1-1/k[$.

We prove in the following that in the interval $]1-1/k, 1]$, $k\log\Gamma_{1_k,1/k,r}(\mu)$ is strictly decreasing. As already seen, the first derivative of $-k\log t_{1/k}(\mu)$ is concave and can be bounded from above by its tangent at $1-1/k$. Then $(-k\log t_{1/k}(\mu))' \le -\frac{k^3}{(k-1)^2}\,(\mu-1+1/k)$. $\log k\,\log G_{1_k,1/k}(\mu)$ has the same properties: $(\log k\,\log G_{1_k,1/k}(\mu))' \le \frac{k^3\log k}{(k-1)^3}\,(\mu-1+1/k)$.

Summing up, $(k\log\Gamma_{1_k,1/k,r}(\mu))' \le \frac{k^3}{(k-1)^2}\left(\frac{\log k}{k-1} - 1\right)(\mu-1+1/k) < 0$ for $k \ge 3$. As a consequence, $\log\Gamma_{1_k,1/k,r}(\mu) < \log\Gamma_{1_k,1/k,r}(1-1/k) = 2\log\gamma_{1_k,r}(1/k)$ for $\mu \in\, ]1-1/k, 1]$. □